Improved Alignment Models for Statistical Machine Translation

نویسندگان

Franz Josef Och

Christoph Tillmann

Hermann Ney

چکیده

In this paper, we describe improved alignment models for statistical machine translation. The statistical translation approach uses two types of information: a translation model and a language model. The language model used is a bigram or general m-gram model. The translation model is decomposed into a lexical and an alignment model. We describe two different approaches for statistical translation and present experimental results. The first approach is based on dependencies between single words, the second approach explicitly takes shallow phrase structures into account, using two different alignment levels: a phrase level alignment between phrases and a word level alignment between single words. We present results using the Verbmobil task (German-English, 6000word vocabulary) which is a limited-domain spoken-language task. The experimental tests were performed on both the text transcription and the speech recognizer output. 1 S t a t i s t i c a l M a c h i n e T r a n s l a t i o n The goal of machine translation is the translation of a text given in some source language into a target language. We are given a source string f / = fl...fj...fJ, which is to be translated into a target string e{ = el...ei...ex. Among all possible target strings, we will choose the string with the highest probability: = argmax {Pr(ezIlflJ)}

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved Discriminative Bilingual Word Alignment

For many years, statistical machine translation relied on generative models to provide bilingual word alignments. In 2005, several independent efforts showed that discriminative models could be used to enhance or replace the standard generative approach. Building on this work, we demonstrate substantial improvement in word-alignment accuracy, partly though improved training methods, but predomi...

متن کامل

A Source-side Decoding Sequence Model for Statistical Machine Translation

We propose a source-side decoding sequence language model for phrase-based statistical machine translation. This model is a reordering model in the sense that it helps the decoder find the correct decoding sequence. The model uses word-aligned bilingual training data. We show improved translation quality of up to 1.34% BLEU and 0.54% TER using this model compared to three other widely used reor...

متن کامل

Extensions to HMM-based Statistical Word Alignment Models

This paper describes improved HMM-based word level alignment models for statistical machine translation. We present a method for using part of speech tag information to improve alignment accuracy, and an approach to modeling fertility and correspondence to the empty word in an HMM alignment model. We present accuracy results from evaluating Viterbi alignments against human-judged alignments on ...

متن کامل

Extentions to HMM-based Statistical Word Alignment Models

متن کامل

Refining Kazakh Word Alignment Using Simulation Modeling Methods for Statistical Machine Translation

Word alignment play an important role in the training of statistical machine translation systems. We present a technique to refine word alignments at phrase level after the collection of sentences from the Kazakh-English parallel corpora. The estimation technique extracts the phrase pairs from the word alignment and then incorporates them into the translation system for further steps. Although ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1999

Improved Alignment Models for Statistical Machine Translation

نویسندگان

چکیده

منابع مشابه

Improved Discriminative Bilingual Word Alignment

A Source-side Decoding Sequence Model for Statistical Machine Translation

Extensions to HMM-based Statistical Word Alignment Models

Extentions to HMM-based Statistical Word Alignment Models

Refining Kazakh Word Alignment Using Simulation Modeling Methods for Statistical Machine Translation

عنوان ژورنال:

اشتراک گذاری